---
layout: post
title: Geospatial Analysis in R
excerpt: This tutorial demonstrates the mapping functionality of R - as presented by MAJ Dave Beskow at the 1st DSCOE Event.
---
In 2005, a group of R developers created the R package sp to extend R with classes and methods for spatial data (Pebesma and Bivand, 2005). Since then, hundreds of packages have been created to assist in analyzing and visualizing spatial data. Given the many GIS tools that already exist, from ESRI as well as many other companies, what is the advantage of conducting geospatial analysis in R? Here’s a comparison of GIS and R provided by Robert Hijmans (UC Davis):
| GIS | R |
|---|---|
| Visual interaction | Data & model focused |
| Data management | Analysis |
| Geometric operations | Attributes as important |
| Standard workflows | Creativity & innovation |
| Single map production | Many (simpler) maps |
| Click, click, click, & click | Repeatability (single script) |
| Speed of execution | Speed of development |
| Typically expensive | Powerful and free |
This short class is designed to introduce geospatial analysis in R. To do this we will focus on the building blocks as well as some packages that facilitate easy analysis and visualization. The primary topics that I will focus on are:
If you need to conduct a quick review of R, I recommend that you look at two introductory lessons that we’ve developed for West Point cadets:
Before you begin conducting spatial analysis in R, make sure that you install three necessary packages: ggmap, Rcpp, and sp. This can be done in the Packages tab in the lower-right quadrant of RStudio. It can also be done on the command line by typing install.packages('ggmap'), install.packages('Rcpp'), and install.packages('sp').
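If you prefer to script this setup, the small sketch below checks which of the three packages are missing before installing; the package names come from this section, and everything else is base R.

```r
## Identify which of the required packages are not yet installed.
pkgs <- c("ggmap", "Rcpp", "sp")
missing <- pkgs[!pkgs %in% rownames(installed.packages())]
missing                        # any names printed here still need installing
## install.packages(missing)  # uncomment to install them all at once
```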
Point data is the simplest type of geospatial data. Spatial point data is used to represent the spatial nature of events. Examples of point data include the locations of customers’ iPhone purchases in business, the locations of crimes in law enforcement, the locations of attacks in the military, or the locations of infrastructure in engineering. Point data means that an entity is represented by an x coordinate and a y coordinate in space. In practice, this is most often done by recording a longitude and a latitude. We are going to import a data set that you should already be familiar with: the Houston crime data set. You can import this data with the commands:
library(ggmap)
library(Rcpp)
data(crime) ##After running this command, you should see 'crime' in your working environment.
Now let’s explore the data. Let’s first see what the column names are:
names(crime)
## [1] "time" "date" "hour" "premise" "offense" "beat"
## [7] "block" "street" "type" "suffix" "number" "month"
## [13] "day" "location" "address" "lon" "lat"
We see that we have some time fields, type of offense fields, as well as some spatial fields. In addition to the street address, we actually have the longitude and the latitude. We can get a little more information on the columns with the command:
str(crime)
## 'data.frame': 86314 obs. of 17 variables:
## $ time : POSIXt, format: "2010-01-01 01:00:00" "2010-01-01 01:00:00" ...
## $ date : chr "1/1/2010" "1/1/2010" "1/1/2010" "1/1/2010" ...
## $ hour : int 0 0 0 0 0 0 0 0 0 0 ...
## $ premise : chr "18A" "13R" "20R" "20R" ...
## $ offense : Factor w/ 7 levels "aggravated assault",..: 4 6 1 1 1 3 3 3 3 3 ...
## $ beat : chr "15E30" "13D10" "16E20" "2A30" ...
## $ block : chr "9600-9699" "4700-4799" "5000-5099" "1000-1099" ...
## $ street : chr "marlive" "telephone" "wickview" "ashland" ...
## $ type : chr "ln" "rd" "ln" "st" ...
## $ suffix : chr "-" "-" "-" "-" ...
## $ number : int 1 1 1 1 1 1 1 1 1 1 ...
## $ month : Ord.factor w/ 8 levels "january"<"february"<..: 1 1 1 1 1 1 1 1 1 1 ...
## $ day : Ord.factor w/ 7 levels "monday"<"tuesday"<..: 5 5 5 5 5 5 5 5 5 5 ...
## $ location: chr "apartment parking lot" "road / street / sidewalk" "residence / house" "residence / house" ...
## $ address : chr "9650 marlive ln" "4750 telephone rd" "5050 wickview ln" "1050 ashland st" ...
## $ lon : num -95.4 -95.3 -95.5 -95.4 -95.4 ...
## $ lat : num 29.7 29.7 29.6 29.8 29.7 ...
Now that we’ve learned a little bit about the data, let’s explore the spatial component of our data. To do this, I usually start by plotting it against a plain white background as simple x and y coordinates. To do this in R, we simply type:
plot(crime$lon,crime$lat,xlab="lon",ylab="lat")
This plot shows us immediately that we have some outliers that seem to be pretty far away from Houston, Texas.
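One simple way to deal with those outliers is a bounding-box filter with subset(). The sketch below uses a toy data frame and an approximate Houston box (both the toy points and the box limits are assumptions for illustration); the same subset() call works on the crime data frame.

```r
## A toy data frame standing in for crime: three points near Houston
## (roughly lon -95.4, lat 29.7) and one obvious outlier.
pts <- data.frame(lon = c(-95.4, -95.3, -95.5, -99.0),
                  lat = c( 29.7,  29.8,  29.6,  33.0))

## Keep only points inside an approximate Houston bounding box
## (the box limits are an assumption, not from the data set).
houston_pts <- subset(pts, lon > -95.8 & lon < -95.0 &
                           lat >  29.4 & lat <  30.2)
nrow(houston_pts)   # the outlier is gone, leaving 3 rows
```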
Now let’s showcase the power of the ggmap package by plotting this same data on an image from Google Maps. First, let’s learn how to get a map from Google Maps. The following code gets and plots a map of Houston from Google Maps:
qmap("houston", zoom = 13) #This command retrieves and plots a map of Houston
Note that if we change the zoom to 6, we get a map of most of Texas centered on Houston:
qmap("houston", zoom = 6) #Change zoom
Now, in order to add points to this plot, we need to store this image in an object and add to it. Plotting with the ggmap package is done with “layers”: you first plot the map, and then you add layers of other geospatial objects and/or text. We do this with the code below:
HoustonMap<-qmap("houston", zoom = 14) #First, get a map of Houston
HoustonMap+ ##This plots the Houston Map
geom_point(aes(x = lon, y = lat), data = crime) ##This adds the points to it
Note here that we assign the column name lon to the x variable and the column name lat to the y variable, and we tell R to get these field names from the crime data frame. If we wanted to color the points by the type of offense, we could do this with the command:
HoustonMap+
geom_point(aes(x = lon, y = lat, colour = offense),
data = crime)
Note here that offense is the name of the column that contains the categorical variable that we want to use to color our points. We can also add a title with ggtitle:
HoustonMap+
geom_point(aes(x = lon, y = lat, colour = offense),
data = crime)+
ggtitle("Map of Crime in Houston")
Also note that we can easily save this image as a file by wrapping our script with two commands. These commands allow us to print our graphic to a file in our working directory instead of printing it to the screen. Here I’ve given a method to print to a .png graphics file:
png("myHoustonMap.png") ##This is the name of the image file (in this case a .png file)
HoustonMap+
geom_point(aes(x = lon, y = lat, colour = offense),
data = crime)+
ggtitle("Map of Crime in Houston")
dev.off() ##This command indicates that we're done creating our plot. It finalizes and closes the .png file.
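The same device pattern works for other formats. Here is a minimal sketch using pdf() and a base-graphics plot so it runs on its own; with ggmap you would put the HoustonMap layers between the two device calls instead.

```r
## Open a PDF device, draw a plot into it, then close the device.
pdf("myPlot.pdf")          # pdf() instead of png() writes a .pdf file
plot(1:10, main = "Device demo")
dev.off()                  # finalizes and closes the file
file.exists("myPlot.pdf")  # the file is now in the working directory
```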
There are many times when you are given data that has a location field, but it doesn’t have a column for latitude and longitude. You may have a field for address, city, or other text describing the location, but you don’t have the latitude and longitude. In these cases, the ggmap package provides a way for you to geocode these locations. The geocode function works by pinging Google Maps as if you were searching it on a web browser, and returning the lat and lon for a given location. For example, we can write:
geocode("West Point, NY")
## lon lat
## 1 -73.95597 41.39148
and note that this returns the latitude and longitude for West Point, NY. You can also enter a full address. Here we’ll enter the address for the White House:
geocode('1600 Pennsylvania Avenue Northwest, Washington, DC 20500')
## lon lat
## 1 -77.03648 38.89768
Try entering your home address. You can use this to get the lat and lon information so that you can plot your spatial data. For example, let’s say that you wanted to create a map of your squad. You could create a squad data frame that looks like this:
mySquad<-data.frame(
Names=c("Bill","Dave","Liz","Cody","Austin","Amber","Jimmy","Mike","Amanda"),
Location=c("Boston, MA","Chicago, IL","Atlanta, GA","Portland, OR","Albuquerque, NM",
"Dallas, TX","St Louis, MO","Miami, FL","San Francisco, CA"),stringsAsFactors=FALSE)
Here’s what our data looks like.
mySquad
## Names Location
## 1 Bill Boston, MA
## 2 Dave Chicago, IL
## 3 Liz Atlanta, GA
## 4 Cody Portland, OR
## 5 Austin Albuquerque, NM
## 6 Amber Dallas, TX
## 7 Jimmy St Louis, MO
## 8 Mike Miami, FL
## 9 Amanda San Francisco, CA
Let’s add lat and lon columns to this data. We can do this by geocoding in a for loop like this:
for(i in 1:9){
temp<-geocode(mySquad$Location[i]) #Geocode each location and store it in temp
mySquad$lon[i]<-temp$lon #Assign the lon
mySquad$lat[i]<-temp$lat #Assign the lat
}
Note that our data now has a spatial aspect that we can use for analysis:
mySquad
## Names Location lon lat
## 1 Bill Boston, MA -71.05888 42.36008
## 2 Dave Chicago, IL -87.62980 41.87811
## 3 Liz Atlanta, GA -84.38798 33.74900
## 4 Cody Portland, OR -122.67648 45.52306
## 5 Austin Albuquerque, NM -106.60555 35.08533
## 6 Amber Dallas, TX -96.79699 32.77666
## 7 Jimmy St Louis, MO -90.19940 38.62700
## 8 Mike Miami, FL -80.19179 25.76168
## 9 Amanda San Francisco, CA -122.41942 37.77493
We can now plot our squad:
US<-qmap("United States",zoom=4,color="bw")
US+
geom_point(aes(x = lon, y = lat),color="red",data = mySquad)+
annotate('text', x=mySquad$lon, y=mySquad$lat+1, label = mySquad$Names, colour = I('red'), size = 4)+
ggtitle("My Squad")
Shapefiles are a popular method of representing geospatial vector data in GIS software. This file format was developed by ESRI (a notable GIS software company). The shapefile format can store data in the form of points, lines, or polygons. Shapefiles are used to represent administrative boundaries (think country/state/county borders), roads, rivers, lakes, etc.
The raster package in R makes it very easy to download various administrative boundaries for any country. You just have to know the three letter ISO Code for the country of interest (you can look this up on Google). Below we demonstrate how to load the raster library, and get the province level administrative boundaries for France.
library(raster) ##Load the Raster Library
france<-getData('GADM', country='FRA', level=1) ##Get the Province Shapefile for France
plot(france) ##Plot this shapefile
Note that level 1 gives the boundaries one level below the national boundary. Level 0 gives only national boundaries, level 1 gives the equivalent of province-level boundaries, and each subsequent level gives finer and finer granularity. In the United States, level 0 gives our national boundary, level 1 gives state boundaries, and level 2 adds county boundaries. You can see level 0 through level 3 for France in the following plot.
Below we will retrieve and plot the county-level shapefile for the United States. Note that when we plot this, the continental United States appears extremely small, since the plot also includes Alaska, Hawaii, and some other islands that belong to the US.
us<-getData('GADM', country='USA', level=2) #Get the County Shapefile for the US
plot(us)
Now let’s explore this data a little further. If we check the class of the us object (by typing class(us)), we will see that it is a SpatialPolygonsDataFrame. This means that it is multiple polygons that are held together in a data frame. We can access each of these polygons in much the same way that we access elements of a normal data frame. Now let’s look at the names of this data frame:
names(us)
## [1] "PID" "ID_0" "ISO" "NAME_0" "ID_1"
## [6] "NAME_1" "ID_2" "NAME_2" "NL_NAME_2" "VARNAME_2"
## [11] "TYPE_2" "ENGTYPE_2"
Notice there are various types of names in this data frame. NAME_0 represents the national boundary, NAME_1 represents state names, and NAME_2 represents county names. We can see this further if we list all of the unique NAME_1 values:
unique(us$NAME_1)
## [1] "Georgia" "Alabama" "Alaska"
## [4] "Arizona" "Arkansas" "California"
## [7] "Colorado" "Connecticut" "Delaware"
## [10] "District of Columbia" "Florida" "Hawaii"
## [13] "Idaho" "Illinois" "Indiana"
## [16] "Iowa" "Kansas" "Kentucky"
## [19] "Louisiana" "Maine" "Maryland"
## [22] "Massachusetts" "Michigan" "South Dakota"
## [25] "Tennessee" "Texas" "West Virginia"
## [28] "Minnesota" "Mississippi" "Missouri"
## [31] "Montana" "Nebraska" "Nevada"
## [34] "New Hampshire" "New Jersey" "New Mexico"
## [37] "New York" "North Carolina" "North Dakota"
## [40] "Ohio" "Oklahoma" "Oregon"
## [43] "Pennsylvania" "Rhode Island" "South Carolina"
## [46] "Utah" "Vermont" "Virginia"
## [49] "Washington" "Wisconsin" "Wyoming"
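Because the attribute table behaves like a data frame, we can also summarize it; for example, table() counts how many county polygons each state contributes. The sketch below runs on a toy vector standing in for us$NAME_1 (the toy values are made up for illustration).

```r
## A toy stand-in for us$NAME_1: one entry per county polygon.
state_per_county <- c("Texas", "Texas", "Texas", "Ohio", "Ohio", "Maine")

## table() counts polygons (counties) per state; sort() ranks them.
counts <- sort(table(state_per_county), decreasing = TRUE)
counts
## state_per_county
##  Texas   Ohio  Maine
##      3      2      1
```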
Below we demonstrate how to subset a SpatialPolygonsDataFrame. Note here that if we wanted to only plot the Northwestern United States, we could use the following code to create a subset that only contains Washington, Oregon, and Idaho.
northwest<-subset(us,NAME_1=="Washington" | NAME_1=="Oregon" | NAME_1=="Idaho")
plot(northwest)
Now let’s create a subset that only contains Texas.
texas<-subset(us,NAME_1=="Texas")
plot(texas)
Now we’ll demonstrate how to plot point data on top of a shapefile. If you start by plotting the shapefile, you only have to call the points command in order to plot the point data on top of the shapefile.
library(ggmap) ##load the ggmap package so we can access the crime data
data(crime) ##load the crime data
plot(texas) ##plot texas
points(crime$lon,crime$lat,col="red",pch=16) ##Add the crime data to the map
Here we can see clearly that some of our points are a long way from Houston.
Another great way to show spatial data is with a choropleth map (sometimes called a heat map). A choropleth map shades a SpatialPolygonsDataFrame by some measurement of the spatial data. One very useful way to shade a map is by the density of spatial points found in each polygon. For example, we could shade the map of Texas by the density of incidents in the crime data set. Before we do this, let me first give you a mapping function that I created for this purpose back in 2011. I am not going to explain all of the computations performed in this code; you just need to copy and paste it into your console for the function to be available to you (you can also source it from an R script file if that is easier). You will need to load the sp and RColorBrewer packages before using this code.
heatMap <-function(data,shape=NULL,col="blue",main="Sample HeatMap"){
# Plots a Heat Map of a Polygons Data Frame. This will
# demonstrate density within a finite set of polygons
#
# Args:
# data: Spatial Points dataframe
# shape: Polygons Data Frame
#
#
# Notes: This function requires the sp and RColorBrewer
# Packages
#
# Beskow: 03/28/11
#
is.installed <- function(mypkg) is.element(mypkg,
installed.packages()[,1])
if (is.installed(mypkg="sp")==FALSE) {
stop("sp package is not installed")}
if (is.installed(mypkg="RColorBrewer")==FALSE) {
stop("RColorBrewer package is not installed")}
if (!class(data)=="SpatialPointsDataFrame") {
stop("data argument is not SpatialPointsDataFrame")}
require(sp)
require(RColorBrewer)
freq_table<-data.frame(tabulate(over(as(data,"SpatialPoints"),
as(shape,"SpatialPolygons")),nbins=length(shape)))
names(freq_table)<-"counts"
shape1<-spChFIDs(shape,as.character(1:length(shape)))
row.names(as(shape1,"data.frame"))
spdf<-SpatialPolygonsDataFrame(shape1, freq_table, match.ID = TRUE)
rw.colors<-colorRampPalette(c("white",col))
spplot(spdf,scales = list(draw = TRUE),
col.regions=rw.colors(max(freq_table)), main=main)
}
Now that we have this function read into our Global Environment, we can use it. Before we do, we need to create a SpatialPointsDataFrame from our crime data. This is done below (after loading the sp and RColorBrewer libraries).
library(sp)
library(RColorBrewer)
crime.sp<-crime ##create a new object that we will coerce to a SpatialPointsDataFrame
crime.sp<-crime.sp[!is.na(crime.sp$lat),] ##Get rid of all NA's in the location field
coordinates(crime.sp)<-c("lon","lat") ##Assigning coordinates coerces this to a SpatialPointsDataFrame
proj4string(crime.sp)<-proj4string(texas) ##Assigns the texas spatial datum to our new data frame
We can check our new data set by running the code:
class(crime.sp) #Prints the class of an object
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
This should show us that crime.sp is a SpatialPointsDataFrame. Now that we’ve checked this, we just have to run a simple command that will create our choropleth.
heatMap(crime.sp,texas,col="red",main="Houston Crime Data Density")
This map isn’t very interesting, mainly because the data is concentrated in a single county. A more interesting example is to plot the protests in France following the Charlie Hebdo shooting in January 2015 (here protests.sp is a SpatialPointsDataFrame of protest locations, built the same way as crime.sp above).
heatMap(protests.sp,france,col="blue", main="Protests in France, 15JAN15-31JAN15")
Note here that most of the protests are located away from Paris. Also note that Southwest France has seen more than 1600 protests over this period.
In this lesson we are going to learn how to create a contour heatmap of spatial point data as well as create pie plots positioned on the centroid of a polygon.
A contoured heat map is another way to demonstrate data density, especially if you do not have, or do not wish to create, a heatmap over a shapefile (like a country’s provinces). The contour heatmap uses two-dimensional kernel density estimation to contour the spatial point data. Think of this as a histogram in two dimensions.
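To see the two-dimensional-histogram idea in isolation, the sketch below runs kernel density estimation on random points with kde2d from the MASS package (MASS ships with R; the point cloud is made up for illustration). stat_density2d performs this same kind of computation before drawing contours.

```r
library(MASS)      # kde2d() lives in MASS, which ships with R

set.seed(42)
lon <- rnorm(500, mean = -95.4, sd = 0.05)   # fake point cloud near Houston
lat <- rnorm(500, mean =  29.7, sd = 0.05)

dens <- kde2d(lon, lat, n = 50)              # density on a 50 x 50 grid
dim(dens$z)                                  # a 50 x 50 matrix of densities
contour(dens, xlab = "lon", ylab = "lat")    # base-graphics contour plot
```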
Below we demonstrate how to do this using the crime data:
library(ggmap) ##load the ggmap package so we can access the crime data
data(crime) ##load the crime data
We will plot the contour map over a ggmap:
houston <- get_map(location = "houston", zoom = 13) ##Get the houston map
houstonMap<-ggmap(houston, extent = "device") ##Prepare Map
houstonMap +
stat_density2d(aes(x = lon, y = lat, fill = ..level..,alpha=..level..), bins = 10, geom = "polygon", data = crime) +
scale_fill_gradient(low = "black", high = "red")+
ggtitle("Map of Crime Density in Houston")
Once you do this, play with the number of bins to see how this variable will change the look of the contours. Try 5, 10, and 15. You can also experiment with changing the color of the heat map.
Often we want to understand the spatial nature of a categorical variable. One way to do this is to position multiple plots over their respective geographic areas. To demonstrate this, we will load an unclassified data set about attacks in Afghanistan. It comes from the Worldwide Incidents Tracking System (WITS) database, which is maintained by the National Counterterrorism Center. This data is available to download from CIS as a csv file.
Copy and paste the function below into your console (or source it from a file). This will load the Pie over Polygon procedure.
plotPiePoly <-function(data,shape,factor,main="Pie Plot",
size=c(1,1),legend=TRUE,legend.pos="bottomright"){
# Plots a Pie Plot over Polygons in a SpatialPolygonsDataFrame
#
# Args:
# data: Spatial Points dataframe
# shape: Shapefile
#   factor: The name of the factor column in the SpatialPointsDataFrame
# main: the title of the graph
# size: The size of the pieplot in inches
#
# Notes: This function requires the sp package.
#
# Beskow: 03/30/11
#
is.installed <- function(mypkg) is.element(mypkg,
installed.packages()[,1])
if (is.installed(mypkg="sp")==FALSE) {
stop("sp package is not installed")}
if (is.installed(mypkg="TeachingDemos")==FALSE) {
stop("TeachingDemos package is not installed")}
if (!class(data)=="SpatialPointsDataFrame") {
stop("data argument is not SpatialPointsDataFrame")}
require(sp)
require(TeachingDemos)
names(data)[grep(factor,names(data))]<-"factor"
data.table<-table(data$factor)
data.table<-sort(data.table[data.table>0],decreasing=TRUE)
my.names<-names(data.table)
palette(rainbow(length(my.names)))
plot(shape)
if(legend){
legend(legend.pos, my.names, col=c(2:(length(my.names)+1)),
text.col = "green4", pch = c(15), bg = 'gray90',cex=0.8)}
for (i in 1:length(shape)){
x<-!is.na(over(as(data,"SpatialPoints"),as(shape[i,],
"SpatialPolygons")))
if(sum(x)>0){
my.table<-table(as.factor(data$factor[x]))
my.table<-my.table[my.table>0]
table.Colors<-match(names(my.table),my.names)+1
subplot(pie(my.table,col=table.Colors,labels=NA),
coordinates(shape[i,])[1],
coordinates(shape[i,])[2],size=size)
}
}
palette("default")
mtext(main,cex=2)
}
Now that you have the function in your working environment, you can load the relevant packages, read in the WITS data, and create your plot. Note that this plot looks better if you “zoom” in on it.
##Load Relevant Packages
library(sp)
library(TeachingDemos)
library(raster)
##Get the Afghanistan Shapefile
afg<-getData('GADM', country='AFG', level=1) #Get the Province Shapefile for Afghanistan
##Load the WITS data from CSV
wits_sp<-read.csv("wits_sp.csv",as.is=TRUE)
##Create the SpatialPointsDataFrame
coordinates(wits_sp)<-c("lon","lat")
proj4string(wits_sp)<-proj4string(afg)
##Now all you have to do is plot it!
plotPiePoly(data = wits_sp,
shape=afg,
factor = "flag",
size = c(0.5, 0.5),
legend = TRUE,
main = "Sample Pieplot Over Polygon (WITS Data)")
This allows you to visualize the spatial nature of a categorical variable.
Google Earth is an extremely versatile tool for understanding geospatial data. Google Earth uses KML files to represent geospatial data. R will allow us to write our geospatial data to a KML file. To do this, you will have to install the maptools and the rgdal libraries. The code for writing a SpatialPointsDataFrame to a KML file is given below:
library(ggmap)
library(maptools)
library(rgdal)
data(crime) #Load the Crime Data Set
crime<-crime[!is.na(crime$lat),] #Get rid of the NA's in the crime data set
crime<-crime[1:100,] #Get only the first 100 records
coordinates(crime)<-c("lon","lat") #Build a SpatialPointsData Frame
proj4string(crime)<-CRS("+proj=longlat +datum=WGS84")
##The two lines below convert the month and day columns to character data
##(both of these columns are originally 'factor' data, which is not compatible)
crime$month<-as.character(crime$month)
crime$day<-as.character(crime$day)
##The one line of code below writes our SpatialPointsDataFrame to a KML File
writeOGR(crime, dsn="crime.kml", layer= "crime", driver="KML")
Now we just have to go to our working directory and double-click on the KML file to open it in Google Earth (assuming that Google Earth is already installed on your computer).